1 ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA GCCCGGCTCG AGTCCCTGCT
51 GCGGCCCCGC CACAAAAAGA GGGCCGAGGC GCAGAAAAGG AGCGAGTCCT 
101 TCCTGCTGAG CGGACTGGCT TTCATGAAGC AGAGGAGGAT GGGTCTGAAC 
151 GACTTTATTC AGAAGATTGC CAATAACTCC TATGCATGCA AACACCCTGA 
201 AGTTCAGTCC ATCTTGAAGA TCTCCCAACC TCAGGAGCCT GAGCTTATGA 
251 ATGCCAACCC TTCTCCTCCA CCAAGTCCTT CTCAGCAAAT CAACCTTGGC 
301 CCGTCGTCCA ATCCTCATGC TAAACCATCT GACTTTCACT TCTTGAAAGT 
351 GATCGGAAAG GGCAGTTTTG GAAAGGTTCT TCTAGCAAGA CACAAGGCAG 
401 AAGAAGTGTT CTATGCAGTC AAAGTTTTAC AGAAGAAAGC AATCCTGAAA
451 AAGAAAGAGG AGAAGCATAT TATGTCGGAG CGGAATGTTC TGTTGAAGAA
501 TGTGAAGCAC CCTTTCCTGG TGGGCCTTCA CTTCTCTTTC CAGACTGCTG 
551 ACAAATTGTA CTTTGTCCTA GACTACATTA ATGGTGGAGA GTTGTTCTAC 
601 CATCTCCAGA GGGAACGCTG CTTCCTGGAA CCACGGGCTC GTTTCTATGC 
651 TGCTGAAATA GCCAGTGCCT TGGGCTACCT GCATTCACTG AACATCGTTT 
701 ATAGAGACTT AAAACCAGAG AATATTTTGC TAGATTCACA GGGACACATT 
751 GTCCTTACTG ACTTCGGACT CTGCAAGGAG AACATTGAAC ACAACAGCAC 
801 AACATCCACC TTCTGTGGCA CGCCGGAGTA TCTCGCACCT GAGGTGCTTC 
851 ATAAGCAGCC TTATGACAGG ACTGTGGACT GGTGGTGCCT GGGAGCTGTC 
901 TTGTATGAGA TGCTGTATGG CCTGCCGCCT TTTTATAGCC GAAACACAGC 
951 TGAAATGTAC GACAACATTC TGAACAAGCC TCTCCAGCTG AAACCAAATA 
1001 TTACAAATTC CGCAAGACAC CTCCTGGAGG GCCTCCTGCA GAAGGACAGG 
1051 ACAAAGCGGC TCGGGGCCAA GGATGACTTC ATGGAGATTA AGAGTCATGT 
1101 CTTCTTCTCC TTAATTAACT GGGATGATCT CATTAATAAG AAGATTACTC 
1151 CCCCTTTTAA CCCAAATGTG AGTGGGCCCA ACGACCTACG GCACTTTGAC 
1201 CCCGAGTTTA CCGAAGAGCC TGTCCCCAAC TCCATTGGCA AGTCCCCTGA 
1251 CAGCGTCCTC GTCACAGCCA GCGTCAAGGA AGCTGCCGAG GCTTTCCTAG 
1301 GCTTTTCCTA TGCGCCTCCC ACGGACTCTT TCCTCTGA (SEQ ID NO:1)

FEATURES: 
Start Codon: 1 
Stop Codon: 1336 
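
The start and stop codon features above can be cross-checked against the listed sequences. The following is a minimal sketch, assuming the standard genetic code and reading frame +1; the names `translate`, `stop_codon_start`, `HEAD` and `TAIL` are illustrative and not part of the original listing.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G (first base slowest).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate a DNA string in frame +1; spaces are ignored, '*' marks a stop."""
    dna = dna.replace(" ", "").upper()
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


def stop_codon_start(cdna_length):
    """1-based position of the first base of a stop codon that ends the cDNA."""
    return cdna_length - 2


# Fragments copied from SEQ ID NO:1 (positions 1-30 and 1321-1338).
HEAD = "ATGGGGGAGA TGCAGGGCGC GCTGGCCAGA"
TAIL = "ACGGACTCTT TCCTCTGA"

print(translate(HEAD))          # MGEMQGALAR
print(translate(TAIL))          # TDSFL*
print(stop_codon_start(1338))   # 1336
```

The translated head and tail agree with the first ten and last five residues of SEQ ID NO:2, and a 1338-base cDNA ending in TGA places the stop codon at position 1336.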



Homologous proteins: 

TOP 10 BLAST Hits 

CRA|67000082668077 gi|14756346 /def=ref|XP_037046.1| (XM..
CRA|18000005074572 gi|5032091 /def=ref|NP_005618.1| (NM_..
CRA|18000004907445 gi|477098 /def=pir||A48094 serum and ..
CRA|150000165029864 gi|13431833 /def=sp|Q9XT18|SGK_RABIT..
CRA|18000005246968 gi|6755490 /def=ref|NP_035491.1| (NM_..
CRA|18000004937507 gi|9507093 /def=ref|NP_062105.1| (NM_..
CRA|18000005171986 gi|3688803 /def=gb|AAC62398.1| (AF057..
CRA|18000005144813 gi|3116066 /def=emb|CAA11528.1| (AJ22..
CRA|18000005144812 gi|3116064 /def=emb|CAA11527.1| (AJ22..
CRA|335001098677651 gi|11321321 /def=gb|AAG34115.1|AF312..
Blast hits to dbEST:

CRA Number           gi Number    Score  Expect
CRA|63000075619018   gi|14816226  1562   0.0
CRA|225000014874111  gi|18504859  1503   0.0
CRA|225000001750119  gi|15756574  1441   0.0
CRA                  gi           1417   0.0
CRA|222000001550936  gi|15446747  1021   0.0
CRA|78000105667113   gi|10352757  1017   0.0
CRA|3000001081732    gi|2877701   1009   0.0
CRA|63000075437648   gi|14804614  1003   0.0
CRA|114000006533071  gi|8142744   985    0.0
CRA|149000089175289  gi|13402435  981    0.0
CRA|155000041341862  gi|10144531  977    0.0
CRA|1000490895790    gi|5433043   954    0.0

EXPRESSION INFORMATION FOR MODULATORY USE: 

library source: 

gi Number    Organ   Tissue Type
gi|13994615  brain   hypothalamus
gi|10999153  PLACE1  placenta
gi|10947466  MAMMA1  mammary gland
gi|10996930  PLACE1  placenta
gi|15438670  brain   hippocampus
1 MGEMQGALAR ARLESLLRPR HKKRAEAQKR SESFLLSGLA FMKQRRMGLN
51 DFIQKIANNS YACKHPEVQS ILKISQPQEP ELMNANPSPP PSPSQQINLG 
101 PSSNPHAKPS DFHFLKVIGK GSFGKVLLAR HKAEEVFYAV KVLQKKAILK 
151 KKEEKHIMSE RNVLLKNVKH PFLVGLHFSF QTADKLYFVL DYINGGELFY 
201 HLQRERCFLE PRARFYAAEI ASALGYLHSL NIVYRDLKPE NILLDSQGHI 
251 VLTDFGLCKE NIEHNSTTST FCGTPEYLAP EVLHKQPYDR TVDWWCLGAV 
301 LYEMLYGLPP FYSRNTAEMY DNILNKPLQL KPNITNSARH LLEGLLQKDR
351 TKRLGAKDDF MEIKSHVFFS LINWDDLINK KITPPFNPNV SGPNDLRHFD
401 PEFTEEPVPN SIGKSPDSVL VTASVKEAAE AFLGFSYAPP TDSFL (SEQ ID NO: 2) 



FEATURES:

Functional domains and key regions: 
Prosite results: 

PDOC00001 PS00001 ASN_GLYCOSYLATION
N-glycosylation site
Number of matches: 4

1 58-61 NNSY 

2 265-268 NSTT 

3 333-336 NITN

4 389-392 NVSG 



PDOC00004 PS00004 CAMP_PHOSPHO_SITE

cAMP- and cGMP-dependent protein kinase phosphorylation site
380-383 KKIT 

PDOC00005 PS00005 PKC_PHOSPHO_SITE
Protein kinase C phosphorylation site 
Number of matches: 4 

1 159-161 SER 

2 337-339 SAR 

3 351-353 TKR 

4 424-426 SVK 



PDOC00006 PS00006 CK2_PHOSPHO_SITE 
Casein kinase II phosphorylation site 
424-427 SVKE 

PDOC00007 PS00007 TYR_PHOSPHO_SITE
Tyrosine kinase phosphorylation site 

130-138 RHKAEEVFY 



FIGURE 2A 



Docket No.: CL001313-DIV 
Serial No.: (to be assigned) 
Inventors: Chunhua YAN et al. 
Title: ISOLATED HUMAN KINASE... 



PDOC00008 PS00008 MYRISTYL 

N-myristoylation site 

175-180 GLHFSF 

PDOC00100 PS00107 PROTEIN_KINASE_ATP
Protein kinases ATP-binding region signature 

118-150 IGKGSFGKVLLARHKAEEVFYAVKVLQKKAILK 

PDOC00100 PS00108 PROTEIN_KINASE_ST
Serine/Threonine protein kinases active-site signature 
232-244 IVYRDLKPENILL 
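
The N-glycosylation matches in the Prosite results above can be reproduced with a short pattern scan. This is a minimal sketch, assuming the PS00001 consensus N-{P}-[ST]-{P}; the function name `find_n_glyc` and the two fragment constants are illustrative, and the fragments are copied from SEQ ID NO:2 (residues 51-70 and 251-270).

```python
import re

# PS00001 consensus N-{P}-[ST]-{P}; the lookahead lets overlapping sites be reported.
N_GLYC = re.compile(r"(?=(N[^P][ST][^P]))")


def find_n_glyc(fragment, first_residue=1):
    """Return (start, end, motif) tuples using 1-based protein coordinates.

    first_residue is the position of the fragment's first residue
    in the full-length protein.
    """
    hits = []
    for m in N_GLYC.finditer(fragment):
        start = first_residue + m.start()
        hits.append((start, start + 3, m.group(1)))
    return hits


FRAG_51_70 = "DFIQKIANNSYACKHPEVQS"     # residues 51-70 of SEQ ID NO:2
FRAG_251_270 = "VLTDFGLCKENIEHNSTTST"   # residues 251-270 of SEQ ID NO:2

print(find_n_glyc(FRAG_51_70, first_residue=51))     # [(58, 61, 'NNSY')]
print(find_n_glyc(FRAG_251_270, first_residue=251))  # [(265, 268, 'NSTT')]
```

Both hits agree with matches 1 and 2 of the N-glycosylation entry; the same function could be run over the full sequence to recover the remaining sites.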



Membrane spanning structure and domains: 
Helix Begin End Score Certainty 
1 293 313 0.948 Putative 

BLAST Alignment to Top Hit:

>CRA|67000082668077 gi|14756346 /def=ref|XP_037046.1|
(XM_037046) serum/glucocorticoid regulated kinase [Homo
sapiens] /org=Homo sapiens /taxon=9606 /div=PRI
/dataset=nraa /length=431
Length = 431 

Score = 843 bits (2154), Expect = 0.0 

Identities = 406/407 (99%), Positives = 407/407 (99%) 

Frame = +1 

Query: 292  LAFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN 471
            +AFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN
Sbjct: 25   IAFMKQRRMGLNDFIQKIANNSYACKHPEVQSILKISQPQEPELMNANPSPPPSPSQQIN 84

Query: 472  LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM 651
            LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM
Sbjct: 85   LGPSSNPHAKPSDFHFLKVIGKGSFGKVLLARHKAEEVFYAVKVLQKKAILKKKEEKHIM 144

Query: 652  SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA 831
            SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA
Sbjct: 145  SERNVLLKNVKHPFLVGLHFSFQTADKLYFVLDYINGGELFYHLQRERCFLEPRARFYAA 204

Query: 832  EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL 1011
            EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL
Sbjct: 205  EIASALGYLHSLNIVYRDLKPENILLDSQGHIVLTDFGLCKENIEHNSTTSTFCGTPEYL 264

Query: 1012 APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA 1191
            APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA
Sbjct: 265  APEVLHKQPYDRTVDWWCLGAVLYEMLYGLPPFYSRNTAEMYDNILNKPLQLKPNITNSA 324

Query: 1192 RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH 1371
            RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH
Sbjct: 325  RHLLEGLLQKDRTKRLGAKDDFMEIKSHVFFSLINWDDLINKKITPPFNPNVSGPNDLRH 384

Query: 1372 FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL 1512
            FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL
Sbjct: 385  FDPEFTEEPVPNSIGKSPDSVLVTASVKEAAEAFLGFSYAPPTDSFL 431 (SEQ ID NO: 4)



Hmmer search results (Pfam): 

Scores for sequence family classification (score includes all domains): 



Model    Description                                  Score   E-value  N
PF00069  Eukaryotic protein kinase domain             298.1   1.1e-85  1
PF00433  Protein kinase C terminal domain             56.0    5.6e-16  1
CE00022  CE00022 MAGUK_subfamily_d                    24.7    3.5e-07  1
CE00359  CE00359 bone_morphogenetic_protein_receptor  14.6    0.0019   2
PF00787  PX domain                                    7.2     2.4      1
CE00031  CE00031 VEGFR                                0.6     2.7      1
CE00292  CE00292 PTK_membrane_span                    -49.8   3.8e-06  1
CE00287  CE00287 PTK_Eph_orphan_receptor              -53.1   7.6e-05  1
CE00289                                               -67.6   0.37
0.00097 -88.2 1
CE00290  CE00290 PTK_Trk_family                       -153.3  9.1e-05  1
CE00016  CE00016 GSK_glycogen_synthase_kinase         -159.4  8.3e-08  1




Parsed for domains: 



Model
.] 7.2 2.4
CE00359 1/2 115 143 .. 144 175
1
0.6 2.7
1 TCTGGCTCGT GCTCTCATGT GATCTCAGAG TTCCAGCTTA TCAGAGGCAT
51 GTAGCAGGGA GGCTTATTCC AGCCATAACT GGGCTCTACC TCCAGCCTCC 
101 AGAAGTAATC CCCAACCTGC ATATCCTTGG GCAACCCGAA GAATGAAAGA 
151 AGAAGCTATA AAACCCCCTT TGAAAGGTTC GTACTTACCG TACTATATTT 
201 TGCAGATGCC TCAAAGGATT TGGGGTTACT TGGCATGGGG AAGGCACATA 
251 AGGTGGGGTG TAGGAGAGGG TCTCTGGTTG TAGGTTTCTT AATTTAATGT 
301 TTGAAAACAA ACATGCAAAA GTCTGTGTGC AGGTTGATGT TTCTGGGCAG 
351 CCTGAGCAAA ATTTGCTCTC TCAAGAGGGA AAGGAACCAG GTGGGAGCAG 
401 AGCTAGGCTG GGCTAGGCTA GTTGAATGGT GGGACATGAC ATACGGGTGG 
451 CACTGGCAAT AACAAAGTCA CATTCTATGA AGATTCCCTG CAAGAGGAAG 
501 CAGACATGGG CCAGTTACTG TGATTTGAAA TTGCCTAAAC ATTGCTTTAG 
551 GTTGGCATGT CAATTTCAGG TACTAGTGTT TTTTTTGTTT TTGTTTTTGT
601 TTTGTTTTTG TTTGTTTGTT TGTTTTGAGA CGGAGTCTCG CTCTGTTGCC
651 AGGCTGGAGT GCAGTGGCGT GATCTCGGCT CACTGCAACC TCCGCCTCCC 
701 GGGTTCAAGC GATTCTCCTG CCTCAGCCTC CCGAGTAACT GGGACTACAG 
751 GCGCACGCCA CCACGCCTGG CTAATTTTTC TATTTTCAGT AGAGACGGGG
801 TTTCACCATG TTAGCCAGGA TGGTCTCGAT CTCTTGACCT CGTGATCCGC 
851 CCGCCTTGGC CTCCCAAAGT GCTGGGATTA CAGGCGTGAG CCACTGCGCC 
901 CGGCCCCAGT AAATGCTTTT TATAAGTGTG GGCACTGAGC AAACTTTCCC
951 AGCCAGACTC CAGGAGAGAG AATGTGTTTC CCTTCTCTCG GTTTGGGGCT 
1001 GTTGCAACAA AGCAAACCAA GGAGTTGAGA CTAGAGCTCA CTTTAGGGCA 
1051 AGTGGGGGTG GTTTTGCCTG CAAAACAAAC CCCTGCCCAA GACCAAGGAA 
1101 AAGGCGTTTC ACATGCTATT CCTGGTTTGA CAGCTGGTAT TTCGGGACTG 
1151 TGCCAGATCC AGTAGGCAAC TTTAAAATGG CAGAGCCTTT GGTAGCAAGA 
1201 GGTCATGGCA GGGCAGCCAC CGCAGACAGC AACAGCGAGC GCCAGGTACC 
1251 TGGCCCTGCG AATAGTGGTA ACTTGTAACT GCCCGCTCCG GGCCCAGTCG 
1301 CTGTGCTCGC GGCTTCCCGG CCAGCACTGG CTCACGTCCC CGCGCCGGCG 
1351 GTCAGGCTGC GGCTCCCAGA CATCCCCCAG CCGCGGGGTT ACTGGAAGGC 
1401 ACCGGCATCG CTGTTCTGCA GAGCCCGGGC CGCCGCCTCG AGCTTCCCTC 
1451 TCTTCCCTGC CTTCTGCAGC GGAGTCACCC GGCTAATCTT TCAGGATAAA 
1501 GTCACAGTTT ATGTGGGACT CACATAAAGA GCGAGCGAGG TGGCAAAACT 
1551 AAGAAGCCCT GGGGCAGCCT TGAGTTAAAC CCAGGGAGGG TAGGGACGAT 
1601 TTTAAGACCA TGTATCATGA CCTGCAGGGT TTTCAGGTGG GACAGCGGGA 
1651 GAGGAGCAGG CCCCACAGAG GAATCGAGGA TGCCCGGTTC ACGCCAGGTC 
1701 TGCCCCCGGG CAAAGCTACC CCTCCCTTCG CTTGTTACCT CCTCACGTGT 
1751 TCTTGGCATG GCAGAGATTA AAAATGCAAG GAAAAAAATT ACATGCGGAA 
1801 CGGACAAAAT GTTCTCAGAG ATTACTTCAG AAAAAAAAAA GTGAAATGCA 
1851 GATTGTACTT CTTCCTTTAG TGCAGAGACG ACTTTTATTT CCGCCCCCTC 
1901 CCCTCCACAT TCCTGACCTC TCCCTCCCCC TTTTCCCTCT TTCTTTCCTT 
1951 CCTTCCTCCT CTTCCAAGTT CTGGGATTTT TCAGCCTTGC TTGGTTTTGG
2001 CCAAAAGCAC AAAAAAGGCG TTTTCGGAAG CGACCCGACC GTGCACAAGG 
2051 GCCATTTGTT TGTTTTGGGA CTCGGGGCAG GAAATCTTGC CCGGCCTGAG
2101 TCACGGCGGC TCCTTCAAGG AAACGTCAGT GCTCGCCGGT CGCTCTCGTC 
2151 TGCCGCGCGC CCCGCCGCCC GCTGCCCATG GGGGAGATGC AGGGCGCGCT 
2201 GGCCAGAGCC CGGCTCGAGT CCCTGCTGCG GCCCCGCCAC AAAAAGAGGG 
2251 CCGAGGCGCA GAAAAGGAGC GAGTCCTTCC TGCTGAGCGG ACTGGGTAAG 
2301 CGCCGCCGCC GGCCCCGCTG GGQGOTGGC TCACTTCCCC AGAGCGGCTT
2351 GGAGGCAGGG GCCGGCTTTC GTCGGAGTTC TCGGGGCCGG GGTCCCGGCG 
2401 GCGGGAACGG GAGGACCTGG CGGGCGAGGT CGCGCGCGCA GGCCTGCGCC 
2451 CCAGGGATAA ACCCCGGAGG GTGGCGCGCA CCGCCGGCTC GGGTTGGGGA
2501 GGAGGGTGGG AGTCCGGCCG CAGGACGGCG CCTGGCCGGG GAGAGGGTAT
2551 CTGCAGGGAC AGTGAGCGAA GCCACCGTGG CCGCCGCGCA CCCGCCGGGA 
2601 AGCGCTTCGG CGCTGCGAAC CCGGCTTTCT CCGGCGGCGG AATAAATGAG 
2651 AGAGGTGGAA AACTACCCCG GGCTCTCCGG CCCTCCCCGC GCCCTCCGCC 
2701 GGCGCGTTCT CTCTCTCCTG CCCCAGGAGC CGATGGAGAC TGATAACGGC 
2751 CCTGCGCCAG GCCGTCCCCG GGCGGTCCTC GCGCCCCCGC CCGGGGCTCG 
2801 CCCTCTCAAT GGGGACAGAA CCGCCCGCCG CAGGCAGCGT AGCCGCCAGC 
2851 AAACCGCGAG GCGGTCGGGG CGGGGCGAGG GGCGAGGCGA AGGGCGGGGC 
2901 CACTTCTCAC TGTCGCGCAG GCCCCGCCCC CGCGGCGGTG CLI 1 1 1 1 1 AT 
2951 AAGGCCGAGC GCGCGGCCTG GCGCAGCATA CGCCGAGCCG GTCTTTGAGC 
3001 GCTAACGTCT TTCTGTCTCC CCGCGGTGGT GATGACGGTG AAAACTGAGG 
3051 CTGCTAAGGG CACCCTCACT TACTCCAGGA TGAGGGGGAT GGTGGCAATT 
3101 CTCATCGGTG AGTGCAGGAA TCTTGCGGGA CTTCTGCTCC AGGAGACGCA 
3151 AAGTGGAAAT TTTTTGAAAG TCCCGGATGA GATTAGTGTG TGTGGCGCCG
3201 GACGTTATGA AGCCGTCTAA ACGTTTCTTT ATTTCTCGTC CTTCATCCAC 
3251 AGCTTTCATG AAGCAGAGGA GGATGGGTCT GAACGACTTT ATTCAGAAGA 
3301 TTGCCAATAA CTCCTATGCA TGCAAACAGT AAGTTTGACC GGATTTGAGG 
3351 AAATAACTAG TATAGTTTGA ATTTGCCAGC GGTAAACATT CTCATCACGG 
3401 CGTTTATCGG GAAGGCGAAG ACTTCTTCTG GGGTGGGGAT CTCATTTCTC 
3451 CTTAAATTCT AATATATTTG ACACATTTTA AACATTAAAG TTAATTTGCT
3501 GATTTGGCTT GAACTGGAGA TGTAAGATAA ATGGTTCGTG TTGGCCGAAT 
3551 TCACGGCCTT TCTCCATGAG CAACAATCCT TATTTCTGTA TTTAATGGGG 
3601 TTTATTATTT TCTTTAACTG ACTAATGTAT TGGGGTATTT TCAGTTTAAA 
3651 CAGTGAATTA TCCGGGTAGA AGTCGGTAGA GCCAGGAAAC TCACTTTTGA 
3701 TGTTGGTGTG CCCCCTAGTG GCGAGCTGGA TTCTAAATCG TGCCCTTTAT 
3751 TCCCTGCAGC CCTGAAGTTC AGTCCATCTT GAAGATCTCC CAACCTCAGG 
3801 AGCCTGAGCT TATGAATGCC AACCCTTCTC CTCCAGTAAG TTTTTGTATG
3851 TGCCGTGCAT CTGTGGAGAA CTGTAAGGGA GTCAGTTAGT ATTCCTACAT 
3901 TAATGGATTA AAATAGCATT TCTAGAAATT AGTATCAAGG CAGGAATGCT 
3951 TCATTATGGC ATAACAAGTG ATATAAATAT TTAAGTATTG AGTCAGAGTA 
4001 TTATTTTATT TTTTTCCTGG GCATATTTTA CCTCCAAAGT GGTTATTTTA
4051 AAAGGCATAT TTCATAAAAA GGTTTTATCT GTCTGAAACA ACATGACTGT 
4101 GTGCAGTTTC CATACTCATT TGAAATGTGA TGAAATGTAG TTTTGAATGT 
4151 TTATAGATGT ATGGTCATTT GCATCAGTCA TTTGTAGATG TAACATTTTC 
4201 TACATCGTTT ATGTTATAGA TGTCTTCCTT TGAAGCAATG GTATTAAAAG 
4251 AAATTCTTTT TTTTTTTTTC TAGCCAAGTC CTTCTCAGCA AATCAACCTT
4301 GGCCCGTCGT CCAATCCTCA TGCTAAACCA TCTGACTTTC ACTTCTTGAA 
4351 AGTGATCGGA AAGGGCAGTT TTGGAAAGGT AATTTCAAAT CTGAAGATCT 
4401 TTTGGTACAC TTCCTTCATG TCCTCTTTTA TATTCTCCCT GGATGAGGAT 
4451 AGAAAAATGA TTTTTTTAAA TTGAAATTTC AGGTTCTTCT AGCAAGACAC
4501 AAGGCAGAAG AAGTGTTCTA TGCAGTCAAA GTTTTACAGA AGAAAGCAAT 
4551 CCTGAAAAAG AAAGAGGTAT GAGATGTGCT TGATGGGGCT GGCATTGGCG 
4601 GTAGACACTC CTTGAATAAT CTTGATTCTG GAATGTTGGT GCCAAGTTGA
4651 AACATGCCAC TAAATCTGAA TCGTCATTTT CCTAGGAGAA GCATATTATG 
4701 TCGGAGCGGA ATGTTCTGTT GAAGAATGTG AAGCACCCTT TCCTGGTGGG 
4751 CCTTCACTTC TCTTTCCAGA CTGCTGACAA ATTGTACTTT GTCCTAGACT 
4801 ACATTAATGG TGGAGAGGTG AGCAGGGGGG ATAGAAGTCA ACTCTTAGTG 
4851 TCTCTGCACA GCCTGCTTTG TTTTAGTTTG AGAAAAAAGT TTTCAAAGAT 
4901 TTTTGGTGGG GAGAATGTTA CCAGAATTAG CATTTCCTTC AACCTGTCAG 
4951 GTTTATAGTT AATAGATTAC TTGGGGCCAC TTCCTGCAGT TGTTCTTTTG 
5001 CTGTGTATGT CAAAACTAAT TAAATTCATT TGCAACCCAG AATGACTTTG 
5051 TTCTGTCTCC TGCAGTTGTT CTACCATCTC CAGAGGGAAC GCTGCTTCCT 
5101 GGAACCACGG GCTCGTTTCT ATGCTGCTGA AATAGCCAGT GCCTTGGGCT 
5151 ACCTGCATTC ACTGAACATC GTTTATAGGT AAGCCTGAGA GCTCTTCAGG 
5201 CTACCAGTTT TGGTATAAAG GAGACGTAGC ACTGGGTGTT TCATAGGGCC 
5251 TTAAAATAAT TTGTGTTTAT TTGCAACTTG GTTGCCTAAA ACCAGATCCC 
5301 CTAGCACGTG AGCTGGCTTG ACTTAAGTGC CAAGGGGGAA CCAGCCAAGT 
5351 AGGATTGTGC CTAATCCAGA ATAGATGAGC AGAACAAGGG CTCCU 1 1 1 I 
5401 TCTTCACTAC ACAACTACAG TGAACCTAAA ATGCGTGTAA TACCTTTAGC 
5451 AATTATCTTT AAGAGGATAT CTTATGAAGT GAAATTAAGT TGTGCAACTA 
5501 CTTTTCTATT CAG MIMA CAGAGACTTA AAACCAGAGA ATATTTTGCT
5551 AGATTCACAG GGACACATTG TCCTTACTGA CTTCGGACTC TGCAAGGAGA 
5601 ACATTGAACA CAACAGCACA ACATCCACCT TCTGTGGCAC GCCGGAGGTA 
5651 GGCGCTGTCT TGGTTTGGTG CCTGGTTTAC CCCCGCCTTC CAAGAGAGAG 
5701 ATGTACAATC ATGCACTTAA CTACCAAAAA GAGTAAACTC CTCTCAGAGA 
5751 CTTCTTAATA CAGTTCAGTG CAAATAAAAT ACATTTGGTG TTTGATGTAG 
5801 CATGAGAAAT CCCAAGTCCT TCTGTTCCTT TACTGAAAAG TAGCTGTTTG 
5851 TAAGTAAGAT CTGCATCATA AAAACTTTCT AAATCCCTAA GTAAGAGATA 
5901 TCAAGTGCCC AGCAGTTTCC TAAATGTCAG TACACATAGG TAGCCAGTCA 
5951 CCCTCAAAAA GTCCAGCAGT TTTATCAGGA AGGAATCTAA AGATATCTAT 
6001 CTTCCAAGCT GGCTCTGGGT CTCTCAGCTT TTTCAAACTA AATGTGTGGT 
6051 CGTGGGATTG CTTGCTTTCG CAGGTTCTAA ACGCTGTTTC CCTGGTCTGT 
6101 TTTTCAGTAT CTCGCACCTG AGGTGCTTCA TAAGCAGCCT TATGACAGGA 
6151 CTGTGGACTG GTGGTGCCTG GGAGCTGTCT TGTATGAGAT GCTGTATGGC 
6201 CTGGTGAGTG GCACATTGGG AACCATGGAA CACTGCCTGC TCCCTACAAT 
6251 ATTGCCTTCA CACAGCCCAT GCTTGGCCAT GGTGTCTTGC CCTTACCAGT 
6301 ACGCTTATCA AAAGCAGCTA AGAGGCATAT TGGTTATTTT ATAGTTCATA 
6351 AGAATAATCA CTTACCTGGT TCTTTTGTGC ATTTCACATT TTACTAGATA 
6401 GGACCACATT GAACCTGTGT GGTGGTGAAA AACTACCACT TATTAACATC 
6451 TACCCCCTCA CCCTCCACAC ACACACACAC AAACACACAC ACGGGTTGCA 
6501 AAGTAGACAC TTAAATAGCA AGGGAAAAGA AAGCATTGAG GTGGGGAGAG 
6551 TTTCTCAAAT CGAGCCTAAT ATTTATTGCC GTTTATATCT TTTTCTCTAC 
6601 TGGTAATGTG TGCCATATGA AACTTCCAAT TAAGTCTAAA GTAATTTTCC 
6651 CCTTCTTTCA GCCGCCTTTT TATAGCCGAA ACACAGCTGA AATGTACGAC
6701 AACATTCTGA ACAAGCCTCT CCAGCTGAAA CCAAATATTA CAAATTCCGC 
6751 AAGACACCTC CTGGAGGGCC TCCTGCAGAA GGACAGGACA AAGCGGCTCG 
6801 GGGCCAAGGA TGACTTCGTG AGTGATGTTT TCCTGTCCTC CTGGGCCGGC 
6851 CGGGACGTGC ACTAGACCTC CCTGCCCTTA TTGAATGCAC CTGTCTAAAT 
6901 TAATCTTGGG 1 1 ILI IATCA ACAGATGGAG ATTAAGAGTC ATGTCTTCTT
6951 CTCCTTAATT AACTGGGATG ATCTCATTAA TAAGAAGATT ACTCCCCCTT
7001 TTAACCCAAA TGTGGTGAGT ATCTGTCTCT CTTCTAAGTA TAGAGAAGCC 
7051 CAAAGGGCAT TTATTTTAAT TCAGAATTGT CTGGGGGAGG GTTGGAAGGA 
7101 ATACATTGGC AGATGTTTTC TCCATAAACC TGTTATTTTA CCTACATAAA 
7151 AAGCACATTT TTGTGTCCCA ACAAGGCTCC CATAAI 1 1 1 1 AGACACATTT 
7201 ATCAATTCGA AGCACCAAAA GGCAACAAGT GAACATTATT CTTATGTTTA 
7251 ACTGTGTGTA GCCTTTTGAG ATTTTGTGCT TGAAGTGGGT GATTATGGAA 
7301 GTTGATATAA GACTTAAACT TGGTATTTAA AGCCTGGTCA AGATTTCCCT 
7351 GTCCTGTGTC TAGTGTGAGT TCTTGACAAG AGTGTTTTTC CCTTCCCGTC 
7401 ACAGAGTGGG CCCAACGACC TACGGCACTT TGACCCCGAG TTTACCGAAG 
7451 AGCCTGTCCC CAACTCCATT GGCAAGTCCC CTGACAGCGT CCTCGTCACA 
7501 GCCAGCGTCA AGGAAGCTGC CGAGGCTTTC CTAGGCTTTT CCTATGCGCC 
7551 TCCCACGGAC TCTTTCCTCT GAACCCTGTT AGGGCTTGGT TTTAAAGGAT 
7601 TTTATGTGTG TTTCCGAATG TTTTAGTTAG CCTTTTGGTG GAGCCGCCAG 
7651 CTGACAGGAC ATCTTACAAG AGAATTTGCA CATCTGTGGA AGCTTAGCAA 
7701 TCTTATTGGA CACTGTTCGC TGGAAGCTTT TTGAAGAGGA CATTCTCCTC 
7751 AGTGAGCTCA TGAGGnnTC ATTTTTATTC TTGGTTCCAA GGTGGTGCTA
7801 TCTCTGAAAC GAGCGTTAGA GTGCCGCCTT AGACGGAGGG AGGAGTTTCG 
7851 TTAGAAAGCG GACGCTGTTC TAAAAAAGGT CTCCTGCAGA TCTGTCTGGG 
7901 CTGTGATGAC GAATATTATG AAATGTGCCT TTTCTGAAGA GATTGTGTTA 
7951 GCTCCAAAGC TTTTCCTATC GCAGTGTTTC AGTTCTTTAT TTTCCCTTGT
8001 GGATATGCTG TGTGAACCGT CGTGTGAGTG TGGTATGCCT GATCACAGAT 
8051 GGATTTTGTT ATAAGCATCA ATGTGACACT TGCAGGACAC TACAACGTGG
8101 GACATTGTTT Gl I ILI I CCA TATTTGGAAG ATAAATTTAT GTGTAGACTT 
8151 TTTTGTAAGA TACGGTTAAT AACTAAAATT TATTGAAATG GTCTTGCAAT 
8201 GACTCGTATT CAGATGCTTA AAGAAAGCAT TGCTGCTACA AATATTTCTA
8251 TTTTTAGAAA GGGTTTTTAT GGACCAATGC CCCAGTTGTC AGTCAGAGCC
8301 GTTGGTGTTT TTCATTGTTT AAAATGTCAC CTGTAAAATG GGCATTATTT 
8351 ATGTTTTTTT TTTTGCATTC CTGATAATTG TATGTATTGT ATAAAGAACG
8401 TCTGTACATT GGGTTATAAC ACTAGTATAT TTAAACTTAC AGGCTTATTT 
8451 GTAATGTAAA CCACCATTTT AATGTACTGT AATTAACATG GTTATAATAC 
8501 GTACAATCCT TCCCTCATCC CATCACACAA LI I 1 1 I I IGT GTGTGATAAA 
8551 CTGATTTTGG TTTGCAATAA AACCTTGAAA AATATTTACA TATATTGTGT 
8601 CATGTGTTAT TTTGTATATT TTGGTTAAGG GGGTAATCAT GGGTTAGTTT 
8651 AAAATTGAAA ACCATGAAAA TCCTGCTGTA ATTTCCTGCT TAGTGGTTTG 
8701 CTCCCAACAG CAGTGGTTTC TGACTCCAGG GGAGTATAGG ATGGTCTTAA 
8751 AGCCAACCTA CGTTCCAGGC LI I I I IAGCA GCATTTTATG GTGTCTGTCA 
8801 TTCATAAATC CATCCAAGGA AATCCTTTGC AATTTACTCA TCTTGCAAGG 
8851 ATTGCTATGA AGTAATGCTT CCTGTATTTA TTGCCTGTCC TGTGAAGTTG 
8901 GACTATTTGT CCTGACATTT GGCTTGTCTT CAGTTACAGG TAATTCTTTC 
8951 CAGAAATATT TGAAAGCCTA CTCTGGGCTC TATTGCGAGT GCTCAGGATA 
9001 TCGTAGTGGA CAAAGCAGAC AACTTCGCCC TTCCAGAGCC TGATGAAGAA 
9051 GGCCGACCTA AAGCAGTTAG TTGAGATGGA AATTGAGAAA TAGTCTGTGA 
9101 AGTTTAGGAG AATGCCACAC AAGAGGGTGA GAATTTTTTT TTTTTTTTTT
9151 TTTTTTTTTG AGACACGGTC TTACTCTGTC GCCCAGGCTG GAGTGCAGTG
9201 GTGTGATCTT GGTTCACTGC AGCCTCCGCC TCCTGGGTTC ATGTGATCCC
9251 CCCATCTCAG CCTCCTGAGT AGCTGGGACT ACAGGCATGC ACCACCATGC
9301 CTGGCTAATT TTTGTATTTT AGTAGAGATG GGATTTCACC ATGTGGGCCA 
9351 GGCTGGTCTC GAATCCCTGG CCTCAAGTGA TCTGTCTGCC TCGGCTTCCC 
9401 TAAGTGCTGG GGAGAATGTT TTAAATAAGT GGATATGTTC CCAAAAAGCT
9451 GACCTGGCTG GGACATCTGG TTTCTGAGAG TACCTGGAGT TGACCCAGGT 
9501 CTAGAGTGAG CTCAGTAAAG GGACCCTGAA GGAGCTCATC CCTAGCTTGG 
9551 ACTGAAGCTT CTTGAGCCAG TGTCTACCTA GCACCCTAAG GGCCCAGCAG 
9601 GCTCTGGGGC TGTGTGGCAG AGCCCACTCC TAGAGCTCAC CCCACTGTGA 
9651 TATTACCTGT GGGAGAAAGC GAGGTGGCAC CATCCTTGGA GATCTTGAGT 
9701 CCAAAGGTTT GGACuTTTC ACTCTTCTAG GCCTTCCACA CAAATACTTA 
9751 ACAAATAATC AGGGAATCCC CAAACAGTTG ATGTTGCTGC TGCCTTAATT 
9801 GCAAAAGCAC CCTGTAGGCC TGCTGCACCC CCGCTACCCT GACCTTCCAG 
9851 TTCGCACAGG GATTTCCCCA AGGGAAAGCT GTGAGCTTTT TTCCTCTTAT 
9901 CCTTGCTCTT GGGTGTCACC TCACTTTGCC TCAGTCCCCC TCTCCTACCC 
9951 CACAAGGTTT CCAAGGGCCA AACAGGTGTT CAGAGATAAC CGAGTTCTTC 
10001 TCCCTCATGA TCTAATGAAG GAAGAAGATG AAAACGAGTC GATAGCTTTT 
10051 TGCTCAAGGT GGGCCACCGG TCATGGTCTG CTGTTGAOT; AGTGCTCTAC 
10101 AGGGATTAGC TACGTGTTCA ATTCCCTACC GGGCCCAGTT GACAAATAAA
10151 GAGTCCAAAG CAAGGCCAGG CACGGTGGCT CACGCTTGTA ATCCCAGCAC 
10201 TTTGGGAGGC CGAGGCGGGC AGATCACGAG GTCAGGAGAT CGAGACGATC 
10251 CTGGCTAACA TGGTGAAACC CCGTCTCTAC TAAAAATACA AAAAAATTAG 
10301 CCGGGCGTGG TGGTGGGCGC CTGTAGTCCC AGCTACTCGG GAGGCTGAGG 
10351 GAGGAGAATG GCGTGAACCA GGGAGGCGGA GCTTGCAGTG AGCGGAGATC 
10401 GCACCACTGC ACTCCAGCCT GGGCGACAGA GCAAGACTCT GTCTCAAAAA 
10451 ACAAAACAAA ACAAAAGCAT GTATTTTCCT ATTAAAGATT GATGCCGGCT 
10501 CTAACATAGA GACTCATTGC ATATTCCCCC TCATTCTCAT TCTCAATAAC 
10551 AGTTATGAAT TCCTCCTCGA ACA (SEQ ID NO: 3) 
CHROMOSOME MAP POSITION:

chromosome 6 



ALLELIC VARIANTS (SNPs):
DNA
Position  Major  Minor  Domain
738       A      G      intron

Context:

DNA
Position

738 GACATACGGGTGGCACTGGCAATAACAAAGTCACATTCTATGAAGATTCCCTGCAAGAGG
    AAGCAGACATGGGCCAGTTACTGTGATTTGAAATTGCCTAAACATTGCTTTAGGTTGGCA
    TGTCAATTTCAGGTACTAGTGTTTTTTTTGTTTTTGTTTTTGTTTTGTTTTTGTTTGTTT
    GTTTGTTTTGAGACGGAGTCTCGCTCTGTTGCCAGGCTGGAGTGCAGTGGCGTGATCTCG
    GCTCACTGCAACCTCCGCCTCCCGGGTTCAAGCGATTCTCCTGCCTCAGCCTCCCGAGTA
    [A,G]
    CTGGGACTACAGGCGCACGCCACCACGCCTGGCTAATTTTTCTATTTTCAGTAGAGACGG
    GGTTTCACCATGTTAGCCAGGATGGTCTCGATCTCTTGACCTCGTGATCCGCCCGCCTTG
    GCCTCCCAAAGTGCTGGGATTACAGGCGTGAGCCACTGCGCCCGGCCCCAGTAAATGCTT
    TTTATAAGTGTGGGCACTGAGCAAACTTTCCCAGCCAGACTCCAGGAGAGAGAATGTGTT
    TCCCTTCTCTCGGTTTGGGGCTGTTGCAACAAAGCAAACCAAGGAGTTGAGACTAGAGCT